Statistically-Consistent k-mer Methods for Phylogenetic Tree Reconstruction

Abstract

Frequencies of $k$-mers in sequences are sometimes used as a basis for inferring phylogenetic trees without first obtaining a multiple sequence alignment. We show that a standard approach of using the squared-Euclidean distance between $k$-mer vectors to approximate a tree metric can be statistically inconsistent. To remedy this, we derive model-based distance corrections for orthologous sequences without gaps, which lead to consistent tree inference. The identifiability of model parameters from $k$-mer frequencies is also studied. Finally, we report simulations showing the corrected distance outperforms many other $k$-mer methods, even when sequences are generated with an insertion and deletion process. These results have implications for multiple sequence alignment as well, since $k$-mer methods are usually the first step in constructing a guide tree for such algorithms.
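
As a rough illustration of the uncorrected starting point mentioned in the abstract, the sketch below builds $k$-mer frequency vectors for two sequences and takes the squared-Euclidean distance between them. It is not the paper's model-based corrected distance, which the abstract does not spell out. The function names (`kmer_frequencies`, `squared_euclidean`, `kmer_distance`), the DNA alphabet `ACGT`, and the normalization by the number of sliding windows are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGT"):
    """Return relative k-mer frequencies in lexicographic k-mer order.

    Assumption: frequencies are counts divided by the number of
    length-k windows in the sequence.
    """
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    return [counts["".join(p)] / n_windows for p in product(alphabet, repeat=k)]


def squared_euclidean(u, v):
    """Squared-Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmer_distance(s1, s2, k, alphabet="ACGT"):
    """Uncorrected k-mer distance: squared-Euclidean distance of frequency vectors."""
    return squared_euclidean(
        kmer_frequencies(s1, k, alphabet),
        kmer_frequencies(s2, k, alphabet),
    )


if __name__ == "__main__":
    # Hypothetical toy sequences, chosen only to exercise the functions.
    print(kmer_distance("ACGTACGTAC", "ACGTTCGTAC", k=2))
```

A matrix of such pairwise values is the kind of input a distance-based tree-building method would take. The abstract's point is that using this raw distance as a stand-in for a tree metric can be statistically inconsistent, which is what motivates the corrected distances.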